Synchronized Audio and Visual Decoding Scheme that is Tolerant to Variation of Processing Environment.
Authors

Abstract

Similar sources
Audio-visual speech fragment decoding
This paper presents a robust speech recognition technique called audio-visual speech fragment decoding (AV-SFD), in which the visual signal is exploited both as a cue for source separation and as a carrier of phonetic information. The model builds on the existing audio-only SFD technique which, based on the auditory scene analysis account of perceptual organisation, works by combining a bottom-...
Audio-visual signal processing in a multimodal assisted living environment
In this paper, we present some novel methods and applications for audio and video signal processing for a multimodal environment of an assisted living smart space. This intelligent environment was developed during the 7th Summer Workshop on Multimodal Interfaces eNTERFACE. It integrates automatic systems for audio and video-based monitoring and user tracking in the smart space. In the assisted ...
NSync: Fault-tolerant Synchronized Audio for Raspberry Pis
Equipping a home with a distributed and synchronized audio system is currently a messy, expensive, and painful process. Rather than taking a traditionally expensive wired proprietary hardware approach, this paper presents the design and implementation of NSync, a distributed and synchronized audio system that leverages wireless communication amongst Raspberry Pis and commodity speakers. We impl...
Decoding representations of face identity that are tolerant to rotation
In order to recognize the identity of a face we need to distinguish very similar images (specificity) while also generalizing identity information across image transformations such as changes in orientation (tolerance). Recent studies investigated the representation of individual faces in the brain, but it remains unclear whether the human brain regions that were found encode representations of...
Audio-visual Speech Processing
Speech is inherently bimodal, relying on cues from the acoustic and visual speech modalities for perception. The McGurk effect demonstrates that when humans are presented with conflicting acoustic and visual stimuli, the perceived sound may not exist in either modality. This effect has formed the basis for modelling the complementary nature of acoustic and visual speech by encapsulating them in...
Journal

Journal title: The Journal of the Institute of Image Information and Television Engineers
Year: 1998
ISSN: 1881-6908,1342-6907
DOI: 10.3169/itej.52.1055